Multi-frame GMM-based block quantisation of line spectral frequencies

Authors

Stephen So

Kuldip K. Paliwal

Abstract

In this paper, we investigate the use of the Gaussian mixture model (GMM)-based block quantiser for coding line spectral frequencies, using multiple frames and mean squared error as the quantiser selection criterion. As a viable alternative to vector quantisers, the GMM-based block quantiser offers low computational and memory requirements together with bitrate scalability. Jointly quantising multiple frames allows the exploitation of correlation across successive frames, which leads to more efficient block quantisation. The efficiency gained from joint quantisation permits the use of the mean squared error distortion criterion for cluster quantiser selection, rather than the computationally expensive spectral distortion. The distortion performance gains come at the cost of an increase in computational complexity and memory. Experiments on narrowband speech from the TIMIT database demonstrate that the multi-frame GMM-based block quantiser can achieve a spectral distortion of 1 dB at 22 bits/frame, or 21 bits/frame with some added complexity. © 2005 Published by Elsevier B.V.
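The scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the GMM means and covariances have already been trained, uses a uniform scalar quantiser as a stand-in for the paper's scalar quantisers, ignores the bits needed to signal the cluster index, and gives every cluster the same bit budget. All function names and the `span` parameter are hypothetical.

```python
import numpy as np

def stack_frames(frames, p):
    """Concatenate every p consecutive frames into one long vector."""
    n = (len(frames) // p) * p          # drop a trailing incomplete block
    return frames[:n].reshape(n // p, p * frames.shape[1])

def allocate_bits(variances, total_bits):
    """High-resolution bit allocation across transform components.

    Rounding to non-negative integers is a simplification, so the
    allocated bits may not sum exactly to total_bits.
    """
    log_var = np.log2(np.maximum(variances, 1e-12))
    bits = total_bits / len(variances) + 0.5 * (log_var - log_var.mean())
    return np.clip(np.round(bits), 0, None).astype(int)

def quantise_in_cluster(x, mean, cov, total_bits, span=3.0):
    """Block-quantise x with one cluster: KLT, scalar quantise, invert."""
    eigvals, eigvecs = np.linalg.eigh(cov)        # KLT basis of the cluster
    y = eigvecs.T @ (x - mean)                    # decorrelated components
    sigma = np.sqrt(np.maximum(eigvals, 1e-12))
    levels = 2 ** allocate_bits(eigvals, total_bits)
    # Uniform quantiser over [-span*sigma, +span*sigma] (assumed stand-in).
    step = 2.0 * span * sigma / levels
    idx = np.clip(np.floor((y + span * sigma) / step), 0, levels - 1)
    y_hat = (idx + 0.5) * step - span * sigma     # 0 when levels == 1
    return mean + eigvecs @ y_hat

def quantise_min_mse(x, means, covs, total_bits):
    """Try every cluster quantiser and keep the one with the lowest MSE."""
    candidates = [quantise_in_cluster(x, m, c, total_bits)
                  for m, c in zip(means, covs)]
    errors = [np.mean((x - c) ** 2) for c in candidates]
    best = int(np.argmin(errors))
    return best, candidates[best]
```

Stacking `p` frames before fitting the cluster covariances lets the transform remove inter-frame correlation as well as intra-frame correlation, which is the source of the efficiency gain the abstract describes. Selecting the cluster by mean squared error, as in `quantise_min_mse`, avoids computing spectral distortion for every candidate reconstruction.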

Similar resources

A comparative study of LPC parameter representations and quantisation schemes for wideband speech coding

In this paper, we provide a review of LPC parameter quantisation for wideband speech coding as well as evaluate our contributions, namely the switched split vector quantiser (SSVQ) and multi-frame GMM-based block quantiser. We also compare the performance of various quantisation schemes on the two popular LPC parameter representations: line spectral frequencies (LSFs) and immittance spectral pa...

Scalable distributed speech recognition using Gaussian mixture model-based block quantisation

In this paper, we investigate the use of block quantisers based on Gaussian mixture models (GMMs) for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. Specifically, we consider the multi-frame scheme, where temporal correlation across MFCC frames is exploited by the Karhunen–Loève transform of the block quantiser. Comp...

Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies

In this paper, we examine a coding scheme for quantising feature vectors in a distributed speech recognition environment that is more robust to noise. It consists of a vector quantiser that operates on the logarithmic filterbank energies (LFBEs). Through the use of a perceptually-weighted Euclidean distance measure, which emphasises the LFBEs that represent the spectral peaks, the vector quanti...

Gaussian Mixture Model-based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec

In this paper, we investigate the use of a Gaussian Mixture Model (GMM)-based quantizer for quantization of the Line Spectral Frequencies (LSFs) in the Adaptive Multi-Rate (AMR) speech codec. We estimate the parametric GMM model of the probability density function (pdf) for the prediction error (residual) of mean-removed LSF parameters that are used in the AMR codec for speech spectral envelope...

Switched split vector quantisation of line spectral frequencies for wideband speech coding

In this paper, we investigate the use of the switched split vector quantiser (SSVQ) for coding short-term spectral envelope information for wideband speech coding. The SSVQ is the hybrid of a switch vector quantiser and split vector quantiser, which has been shown in previous studies to be more efficient, in terms of rate-distortion, as well as possessing low computational complexity, than the ...

Journal:

Speech Communication

Volume 47

Publication date: 2005
